Services for Data Access and Data Processing on Grids

Author

  • Inderpal Narang
Abstract

An increasing number of grid applications manage data at very large scale, of both size and distribution. In this paper we discuss data access and data processing services for such applications, in the context of a grid. The complexity of data management on a grid arises from the scale, dynamism, autonomy, and distribution of data sources. The main argument of this paper is that these complexities should be made transparent to grid applications, through a layer of virtualization services. We start by discussing the various dimensions of transparent data access and processing, and illustrate their benefits in the context of a specific application. We then present a layer of grid data virtualization services that provide such transparency and enable ease of data access and processing. These services support federated access to distributed data, dynamic discovery of data sources by content, dynamic migration of data for workload balancing, parallel data processing, and collaboration. We describe both our long-term vision for these services and a concrete proposal for what is achievable in the near term. We also discuss some support that grid data sources can provide to enable efficient virtualization.

GFD-I.14, DAIS Working Group

  • Vijayshankar Raman, IBM Almaden Research Center
  • Inderpal Narang, IBM Almaden Research Center
  • Chris Crone, IBM Silicon Valley Lab
  • Laura Haas, IBM Silicon Valley Lab
  • Susan Malaika, IBM Silicon Valley Lab
  • Tina Mukai, IBM Silicon Valley Lab
  • Dan Wolfson, IBM Silicon Valley Lab
  • Chaitan Baru, San Diego Supercomputer Center

Related resources

Improving Mobile Grid Performance Using Fuzzy Job Replica Count Determiner

Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common computational platform. Mobile computing is a generic term for the use of movable, handheld devices with wireless communication to process data. Mobile computing focuses on providing access to data, information, services and communications anywhere an...

Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy

Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources that enables users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of a file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...

A Federated Grid Environment with Replication Services

Grids can be classified as computational grids, access grids and data grids. Computational grids address applications that deal with complex and time-intensive computational problems, usually on relatively small datasets. Access grids focus on group-to-group communication, whereas data grids address the needs of applications that deal with the evaluation and mining of large amounts of data in t...

E2DR: Energy Efficient Data Replication in Data Grid

Abstract: Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client applications in scientific and business domai...

Publication date: 2003